Symbolic Dynamic Programming for First-order POMDPs
Authors
Abstract
Partially-observable Markov decision processes (POMDPs) provide a powerful model for sequential decision-making problems with partially-observed state and are known to have (approximately) optimal dynamic programming solutions. Much work in recent years has focused on improving the efficiency of these dynamic programming algorithms by exploiting symmetries and factored or relational representations. In this work, we show that it is also possible to exploit the full expressive power of first-order quantification to achieve state, action, and observation abstraction in a dynamic programming solution to relationally specified POMDPs. Among the advantages of this approach are the ability to maintain compact value function representations, abstract over the space of potentially optimal actions, and automatically derive compact conditional policy trees that minimally partition relational observation spaces according to distinctions that have an impact on policy values. This is the first lifted relational POMDP solution that can optimally accommodate actions with a potentially infinite relational space of observation outcomes.
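To make the underlying dynamic-programming operation concrete, the following is a minimal sketch of the exact value-iteration backup for a flat, enumerated POMDP; the symbolic approach described above performs this kind of backup at the lifted, first-order level rather than over an enumerated model. The toy model and every name in it (T, O, R, gamma) are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch of the flat (ground) POMDP value-iteration backup.
# The tiny model below and all names (T, O, R, gamma) are illustrative assumptions.
import itertools
import numpy as np

gamma = 0.95
T = np.array([[[0.9, 0.1], [0.5, 0.5]],     # T[a][s][s']: transition probabilities
              [[0.2, 0.8], [0.6, 0.4]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]],     # O[a][s'][z]: observation probabilities
              [[0.5, 0.5], [0.9, 0.1]]])
R = np.array([[1.0, 0.0],                   # R[a][s]: immediate reward
              [0.0, 2.0]])
A, S, Z = T.shape[0], T.shape[1], O.shape[2]

def dominated(v, others):
    # v is pointwise dominated if some other vector is >= everywhere and > somewhere
    return any(np.all(w >= v) and np.any(w > v) for w in others)

def dp_backup(Gamma):
    """One exact dynamic-programming backup of the alpha-vector set Gamma."""
    new = []
    for a in range(A):
        # project each alpha-vector through every observation branch
        g = [[T[a] @ (O[a][:, z] * alpha) for alpha in Gamma] for z in range(Z)]
        # cross-sum: one projected vector per observation, for every combination
        for choice in itertools.product(*g):
            new.append(R[a] + gamma * np.sum(choice, axis=0))
    # crude pruning: drop vectors that are pointwise dominated by another one
    return [v for i, v in enumerate(new)
            if not dominated(v, new[:i] + new[i + 1:])]

Gamma = [np.zeros(S)]                        # horizon-0 value function
for _ in range(3):                           # a few finite-horizon backups
    Gamma = dp_backup(Gamma)
b = np.array([0.5, 0.5])                     # value of an example belief state
print(len(Gamma), "alpha-vectors; V(b) =", max(float(v @ b) for v in Gamma))
```

The exhaustive cross-sum above grows exponentially with the horizon, which is exactly the cost that abstraction over states, actions, and observations is meant to avoid.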
Similar resources
Symbolic Dynamic Programming for Continuous State and Observation POMDPs
Point-based value iteration (PBVI) methods have proven extremely effective for finding (approximately) optimal dynamic programming solutions to partially-observable Markov decision processes (POMDPs) when a set of initial belief states is known. However, no PBVI work has provided exact point-based backups for both continuous state and observation spaces, which we tackle in this paper. Our key in...
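For contrast with the exhaustive backup sketched earlier, here is a sketch of the point-based backup that PBVI-style methods apply only at a fixed set of belief points. It reuses the same illustrative toy model; all numbers and names (T, O, R, gamma, beliefs) are assumptions, not taken from the cited paper.

```python
# A minimal point-based backup at a fixed set of belief points (PBVI in miniature).
# The toy POMDP below is an illustrative assumption.
import numpy as np

gamma = 0.95
T = np.array([[[0.9, 0.1], [0.5, 0.5]],     # T[a][s][s']
              [[0.2, 0.8], [0.6, 0.4]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]],     # O[a][s'][z]
              [[0.5, 0.5], [0.9, 0.1]]])
R = np.array([[1.0, 0.0], [0.0, 2.0]])      # R[a][s]
A, Z = T.shape[0], O.shape[2]

def point_based_backup(b, Gamma):
    """Back up the alpha-vector set Gamma at a single belief point b."""
    best, best_val = None, -np.inf
    for a in range(A):
        v = R[a].astype(float)
        for z in range(Z):
            # project each alpha-vector through (a, z) and keep the one best at b
            proj = [T[a] @ (O[a][:, z] * alpha) for alpha in Gamma]
            v = v + gamma * max(proj, key=lambda g: float(b @ g))
        if float(b @ v) > best_val:
            best, best_val = v, float(b @ v)
    return best

# One alpha-vector per belief point, improved over a few sweeps.
beliefs = [np.array([0.5, 0.5]), np.array([0.9, 0.1]), np.array([0.1, 0.9])]
Gamma = [np.zeros(2)]
for _ in range(10):
    Gamma = [point_based_backup(b, Gamma) for b in beliefs]
print([round(max(float(v @ b) for v in Gamma), 3) for b in beliefs])
```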
Symbolic Dynamic Programming
A symbolic dynamic programming approach for solving first-order Markov decision processes within the situation calculus is presented. As an alternative specification language for dynamic worlds, the fluent calculus is chosen, and the fluent calculus formalization of the symbolic dynamic programming approach is provided. The major constructs of Markov decision processes such as the optimal value f...
Bounded Dynamic Programming for Decentralized POMDPs
Solving decentralized POMDPs (DEC-POMDPs) optimally is a very hard problem. As a result, several approximate algorithms have been developed, but these do not have satisfactory error bounds. In this paper, we first discuss optimal dynamic programming and some approximate finite horizon DEC-POMDP algorithms. We then present a bounded dynamic programming algorithm. Given a problem and an error bou...
Symbolic Dynamic Programming within the Fluent Calculus
A symbolic dynamic programming approach for modelling first-order Markov decision processes within the fluent calculus is given. Based on an idea initially presented in [3], the major components of Markov decision processes such as the optimal value function and a policy are logically represented. The technique produces a set of first-order formulae with equality that minimally partitions the s...
Solving Informative Partially Observable Markov Decision Processes
Solving Partially Observable Markov Decision Processes (POMDPs) is generally computationally intractable. In this paper, we study a special POMDP class, namely informative POMDPs, where each observation provides good albeit incomplete information about world states. We propose two ways to accelerate the value iteration algorithm for such POMDPs. First, dynamic programming (DP) updates can be carrie...